Abstract

This  research  focused  on  representing  the  magnitude, 
phase,  and  orientation  of  each  two-dimensional  spatial 
frequency  component  of  an  input  as  a  three-dimensional 
vector.  An  extension  of  the  concept  of  phasor  analysis  was 
developed  to  facilitate  this  representation.  Using  Fourier 
analysis  theory,  any  input  can  be  considered  as  the  sum  of 
its  spatial  frequency  components.  Therefore,  the  input  could 
be  represented  by  a  three-dimensional  vector  which  was  the 
summation  of  the  vector  representations  of  the  spatial 
frequency  components. 

This  theory  was  tested  in  two  phases.  First,  isolated 
English  capital  letters  A  through  Z  were  discretized  using  an 
8x8  grid  intended  to  simulate  an  input  sensor  device.  The 
vectors representing these inputs were examined to ensure
that the letters were separable and that letters which appeared
similar to human observers were clustered. Second, twenty
test  letters  were  used  to  test  the  capability  of  this  scheme 
in  recognizing  variants  of  the  templates.  Each  phase  was 
tested  using  unfiltered  spatial  frequency  components  and 
components  filtered  by  the  modulation  transfer  function  ( MTF ) 
of  the  human  visual  system,  and  the  vectors  were  plotted  in 
one  of  two  spaces. 

The  template  letters  were  separable  and  similar  letters 
were  clustered.  The  recognition  rate  ranged  from  40%  to  60%, 
depending  on  the  choice  of  input  filter  and  space  used  to 
plot  the  vectors. 




I. 


Many  theories  concerning  the  processing  of  information 
by  the  human  visual  system  have  been  hypothesized  in  the  past 
20  years.  One  of  these  theories,  based  on  psycho- 
physiological  studies,  is  the  size  and  orientation  (spatial 
frequency)  channel  theory  proposed  by  Campbell  and  Robson 
(Ref  6).  In  addition  to  this  theory,  the  model  developed  by 
Kabrisky  (Ref  11),  which  is  based  on  anatomical  studies  of 
the  visual  cortex,  proposed  that  the  cortical-cortical 
connections  between  the  primary  and  secondary  visual  cortex 
could  support  Fourier  or  similar  transformation  computations. 
This  theory  and  model  of  the  human  visual  system  were  the 
basis  for  many  pattern  recognition  systems  using  spatial 
frequency  analysis  which  have  been  developed  in  recent  years. 


Much  research  has  been  done  at  the  Air  Force  Institute 
of  Technology,  in  conjunction  with  the  Aerospace  Medical 
Research  Laboratory,  and  at  other  research  centers  trying  to 
emulate  the  human  visual  system,  based  on  the  premise  that 
the  human  visual  system  processes  information  and  performs 
pattern  recognition  in  the  spatial  frequency  domain.  This 
research  focused  on  developing  algorithms  which  computed 
correlations  of  low  spatial  frequency  components  of  input 
data to identify English letters (Ref 20), the Cyrillic
alphabet and objects in reconnaissance photographs (Ref 22),
and Chinese characters (Ref 1). These efforts were
successful,  and  they  demonstrated  that  the  low  spatial 
frequency  components  contained  the  essence  of  the  input  form. 

While  these  pattern  recognition  efforts  were  successful, 
much  potentially  valuable  information  was  deleted  when  the 
high  spatial  frequency  components  were  discarded.  This 
information may be unnecessary for identifying a letter or
an  object,  but  it  is  essential  for  identifying  the  font  of 
the  letter  or  the  detail  of  the  object.  But  retaining  this 
additional  information  required  much  additional  storage  space 
if  a  correlation  of  all  of  the  spatial  frequency  components 
was  required.  Therefore,  another  method  of  retaining 
essential  spatial  frequency  information  is  needed,  since 
storage  space  is  a  prime  consideration. 

Problem 

The  objective  of  this  study  was  to  develop  a  method  for 
representing  the  magnitude,  phase,  and  orientation  of  each 
spatial  frequency  component  of  an  input  as  a  vector  in 
three-dimensional  space.  A  technique  called  phasor  analysis, 
frequently  used  to  represent  the  magnitude  and  phase  of 
one-dimensional  sinusoids  as  a  two-dimensional  vector,  was 
extended to allow two-dimensional sinusoids to be represented
as three-dimensional vectors. Using Fourier analysis theory,
an input scene can be
considered as the sum of its spatial frequency components.
Once each spatial frequency component is represented as a
single vector in three-dimensional space, the entire
input scene can be represented by a vector sum, which is also
a single vector in three-dimensional space.

Scope

The  input  data  used  as  templates  for  this  research  were 
isolated English capital letters A through Z, handprinted on
an 8x8 grid. The grid was intended to simulate an ideal
64-element sensor array, and each element was assumed to
respond linearly to the radiance of the corresponding section
of the input character. The letters were drawn to completely
fill the grid, and were discretized by visually determining
the extent to which each element was filled by the character.

Each  element  was  divided  into  8  subelements,  and  each 
subelement  was  assigned  a  nominal  value  of  1.0  (arbitrary 
units).  If  all  subelements  were  filled,  the  element  was 


separate  tests,  the  modulation  transfer  function  (MTF)  of  the 
human  visual  system  was  applied  to  the  coefficients  to 
emulate  the  spatial  frequency  response  of  the  visual  system. 
In  all  tests,  the  three-dimensional  vector  sum  representing 
each  template  was  determined  in  two  different  spaces,  denoted 
Space  1  and  Space  2.  As  part  of  the  experiment,  the  template 
vectors were examined to ensure the letters were separable
and  letters  that  appear  similar  to  human  observers  were 
clustered  in  these  spaces. 

The  ability  of  this  method  to  identify  variants  of  the 
templates  was  also  tested.  Twenty  test  letters,  which  ranged 
from  minor  to  extreme  variants  of  the  templates,  were  used 
for  these  tests. 


Sequence  of  Presentation 

Background  on  a  current  theory  of  human  visual 
information processing that is based on the hypothesis that
the visual system functions as a spatial frequency analyzer
is provided in the next chapter. In Chapter III, an overview
of Fourier analysis theory and a method for determining the
orientation of spatial frequency components are presented. In
Chapter IV, the method used for inputting data and the problem
of background radiance and contrast changes are discussed. In
Chapter V, the experimental procedures, the results of the
experiment, and the criteria used to evaluate the results are
described. The conclusions based on this research and
recommendations for future study are discussed in Chapter VI.


Discretized  versions  of  the  template  letters  used  in  this 
research  are  given  in  Appendix  A.  Discretized  versions  of  the 
test  letters  are  given  in  Appendix  B. 


II.  The Human Visual System


Human Visual System as a Spatial Frequency Analyzer

One  of  the  current  theories  of  human  visual  information 
processing  hypothesizes  that  the  human  visual  system 
functions  as  a  multi-channel  spatial  frequency  analyzer. 
According  to  the  theory  proposed  by  Campbell  and  Robson  (Ref 
6),  the  human  visual  system  is  composed  of  many  narrow 
bandwidth  channels,  each  tuned  to  a  different  center 
frequency.  The  modulation  transfer  function,  or  contrast 
sensitivity  of  the  human  visual  system  as  described  by 
Ginsburg  (Ref  8),  is  hypothesized  to  be  the  resultant  of  the 
combined  activity  of  these  individual  channels.  Studies  by 
Glezer et al. (Ref 9) suggest that the receptive fields of the
retina  and  the  corresponding  receptive  fields  of  the  primary 
visual  cortex  respond  as  narrow  bandwidth  spatial  frequency 
filters  such  as  those  suggested  by  Campbell  and  Robson.  Other 
studies  by  Maffei  and  Fiorentini  (Ref  14,15),  and  Tootell  et 
al.  (Ref  23)  suggest  that  the  primary  visual  cortex  may  be 
organized  as  a  spatial  frequency  analyzer,  with  spatial 
frequency  and  orientation  information  segmented  within  the 
substructure  of  the  primary  visual  cortex. 

These  psychological  and  psycho-physiological  experiments 
support  the  hypothesis  that  the  visual  system  extracts 
contrast information, categorized according to spatial
frequency and orientation, from the input scene. Exactly how
spatial frequency and orientation information may be stored,
or whether this information is the basis for pattern recognition
within the human brain, remains a mystery.


Modulation  Transfer  Function  of  the  Human  Visual  System 

Since  this  research  was  attempting  to  emulate  the  human 
visual  system  as  a  spatial  frequency  analyzer,  the  modulation 
transfer function (MTF) as defined by Ginsburg (Ref 8) was
used  as  a  spatial  frequency  filter  to  simulate  the  spatial 
frequency  response  of  the  human  visual  system.  In  order  to 
apply  the  MTF,  the  spatial  frequencies  fx,fy  had  to  be 
defined  in  terms  of  cycles/degree.  Therefore,  the  spatial 
frequency  spectrum  of  the  input  image  was  defined  as  having  a 
fundamental  frequency  of  1  cycle/degree,  and  the  highest 
frequency  present  was  4  cycles/degree.  Then,  the  MTF  could  be 
applied.  Ginsburg  used  both  an  MTF(H)  and  an  MTF(L),  which 
were  results  of  two  different  psycho-physiological 
experiments  by  Blakemore  and  Campbell  (Ref  2),  and  Campbell, 
Kulikowski,  and  Levison  (Ref  4),  respectively.  Therefore, 
both the MTF(H) and MTF(L) were used in this research. The
first quadrants of the MTF(H) and MTF(L) in the range of 1 to
4 cycles/degree are shown in Figure II-1.

Figure II-1. First Quadrant of the Modulation Transfer
Functions MTF(H) and MTF(L) of the Human Visual
System (1 to 4 Cycles/Degree) (Ref 8:139-140).

III.


Overview  of  Fourier  Analysis 

In  order  to  analyze  the  spatial  frequency  content  of  an 
input  scene,  the  coefficients  of  the  spatial  frequency 
components  of  the  scene  must  be  determined.  Digitally,  this 
determination  is  made  by  computing  the  discrete  Fourier 
transform  (DFT),  which  is  usually  computed  using  a  Fast 
Fourier  Transform  (FFT)  algorithm.  A  detailed  discussion  of 
the  DFT  and  the  FFT  can  be  found  in  the  text  by  Oppenheim  and 
Schafer (Ref 18). Only a brief overview will be presented
here. 

Since  an  infinite  number  of  computations  cannot  be  made, 
the  continuous  Fourier  Transform  of  an  input  cannot,  in 
general,  be  computed  directly.  Therefore,  an  approximation  is 
used by computing the Fourier transform of a discrete number of
samples.  This  computation  is  known  as  the  DFT.  The  FFT  is 
simply  a  more  efficient  method  of  computing  the  DFT. 

To  compute  the  two-dimensional  DFT,  the  input  scene 
f(x,y),  which  is  of  infinite  extent,  must  be  redefined  as 
f'(x,y),  where  f'(x,y)  is  the  product  of  the  original  f(x,y) 
and  a  rectangular  windowing  function  of  dimension  M  samples 
by  N  samples.  Therefore,  the  function  f'(x,y)  is  defined  as: 


$$
f'(x,y) =
\begin{cases}
f(x,y), & 0 \le x \le M-1,\ 0 \le y \le N-1 \\
0, & \text{elsewhere}
\end{cases}
\tag{1}
$$

Now the discrete Fourier coefficients can be computed. The
computations  are  made  according  to  the  following  equations: 


$$
A(f_x,f_y) = \sum_{y}\sum_{x} f'(x,y)\cos[2\pi(f_x x + f_y y)] \tag{2a}
$$

$$
B(f_x,f_y) = \sum_{y}\sum_{x} f'(x,y)\sin[2\pi(f_x x + f_y y)] \tag{2b}
$$

where  A(fx,fy)  is  the  real  part  of  the  coefficient  of  spatial 
frequency  component  fx,fy;  and,  B(fx,fy)  is  the  imaginary 
part  of  the  coefficient  of  the  spatial  frequency  component 
fx,fy. The scaling factor 1/(M·N) was assumed to be part of the
inverse Fourier transform; therefore, it was neglected here.
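
As an illustration of equations (2a) and (2b), the following is a minimal Python sketch, not the original implementation. It assumes that fx and fy are expressed in cycles per sample, so that integer frequency indices kx and ky correspond to fx = kx/M and fy = ky/N; the function name and the use of NumPy are choices made for this sketch only.

```python
import numpy as np

def dft_coefficients(f):
    """Return (A, B) following equations (2a)/(2b), without the 1/(M*N) scale.

    Assumption for this sketch: axis 0 of `f` is x and axis 1 is y, and the
    integer indices (kx, ky) stand for fx = kx/M and fy = ky/N.
    """
    f = np.asarray(f, dtype=float)
    M, N = f.shape
    x = np.arange(M)[:, None]   # column of x sample positions
    y = np.arange(N)[None, :]   # row of y sample positions
    A = np.zeros((M, N))
    B = np.zeros((M, N))
    for kx in range(M):
        for ky in range(N):
            phase = 2.0 * np.pi * (kx * x / M + ky * y / N)
            A[kx, ky] = np.sum(f * np.cos(phase))   # equation (2a)
            B[kx, ky] = np.sum(f * np.sin(phase))   # equation (2b)
    return A, B
```

With this sign convention, A matches the real part of a standard FFT and B is the negative of its imaginary part, because the usual DFT kernel is exp(-j·phase).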

However,  not  all  of  these  A(fx,fy)  and  B(fx,fy) 
coefficients  are  unique,  since  the  DFT  is  a  folded  transform 
and  the  coefficients  are  present  in  conjugate  pairs.  For 
example, an 8x8 (64 sample) input has 34 unique A(fx,fy)
coefficients  and  30  unique  B(fx,fy)  coefficients,  with  the 
remaining  coefficients  being  either  always  zero  or  conjugates 
of  another  coefficient. 
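
The count of 34 and 30 can be checked by pairing every DFT index with its conjugate partner. The sketch below is only a counting aid under the standard conjugate-symmetry property of the DFT of a real input; the function name is hypothetical.

```python
def count_unique(M, N):
    """Count unique A (real) and B (imaginary) DFT coefficients of a real
    M-by-N input, using the pairing (kx, ky) <-> (-kx mod M, -ky mod N)."""
    seen = set()
    unique_a = 0
    unique_b = 0
    for kx in range(M):
        for ky in range(N):
            if (kx, ky) in seen:
                continue
            partner = ((-kx) % M, (-ky) % N)
            seen.add((kx, ky))
            seen.add(partner)
            unique_a += 1            # one real part per point or pair
            if partner != (kx, ky):
                unique_b += 1        # self-conjugate points always have B = 0
    return unique_a, unique_b
```

For an 8x8 input, the four self-conjugate points (0,0), (0,4), (4,0), and (4,4) contribute only A terms, and the remaining 60 points form 30 conjugate pairs.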

DFT of a Two-Dimensional Input: Definition of Unique Fourier
Series Coefficients

The  DFT  of  a  two-dimensional  input  requires  the  input  to 
be  multiplied,  sample  by  sample,  times  a  set  of  discretized 
two-dimensional  basis  functions  of  the  general  forms 
cos[2π(fx·x + fy·y)] and sin[2π(fx·x + fy·y)]. In order for the
orientation of the particular basis function to be
determined, the function must be in "fundamental" form, with
|fx| ≤ M/2 and |fy| ≤ N/2. In compliance with the Nyquist
criterion, M/2, N/2 is the maximum spatial frequency that can
exist in an M sample by N sample input without aliasing.

When the DFT was computed using equations (2a) and (2b),
the "fundamental" spatial frequencies were K ≤ fx ≤ M/2 and
L ≤ fy ≤ N/2, where:


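$$ K = -M/2 + 1 \quad \text{for } M \text{ even;} \tag{3a} $$

$$ K = -M/2 \quad \text{for } M \text{ odd;} \tag{3b} $$

$$ L = -N/2 + 1 \quad \text{for } N \text{ even; and,} \tag{3c} $$

$$ L = -N/2 \quad \text{for } N \text{ odd.} \tag{3d} $$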

It  is  necessary  to  compute  only  the  Fourier  coefficients  at 
spatial frequencies in the ranges 0 ≤ fx ≤ M/2 and
L ≤ fy ≤ N/2, since the remaining coefficients are the
conjugates  of  these  coefficients. 
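
The mapping from a conventional DFT index (0 to M-1) to a "fundamental" frequency in the range of equations (3a)-(3d) can be sketched as follows. This is an illustrative helper, not part of the original work; it assumes that M/2 is truncated to an integer when M is odd, which the text does not state explicitly.

```python
def lower_bound(M):
    """K (or L) from equations (3a)-(3d); M/2 is truncated for odd M
    (an assumption of this sketch)."""
    return -(M // 2) + 1 if M % 2 == 0 else -(M // 2)

def to_fundamental(k, M):
    """Map a DFT index k in 0..M-1 to its fundamental frequency."""
    if not 0 <= k < M:
        raise ValueError("index out of range")
    return k if k <= M // 2 else k - M
```

For M = 8, the indices 0 through 7 map to 0, 1, 2, 3, 4, -3, -2, -1, so the fundamental range runs from K = -3 to M/2 = 4.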

In terms of a discrete Fourier Series (DFS)
representation  of  the  input,  each  DFT  coefficient  and  its 
conjugate  are  combined  to  yield  a  single  coefficient  of  each 
sine  and  cosine  component  of  the  input.  The  DFS  coefficients, 
which  are  defined  as  unique  Fourier  Series  coefficients,  of 
an  8x8  (64  sample)  input  are  shown  in  Figure  III-1.  The 
factor  of  2  accounts  for  the  conjugate  coefficient  at  the 
corresponding  "negative"  spatial  frequency  in  a  DFT 
representation.

Figure III-1. Unique DFS Coefficients for an 8x8 Input


Orientation of the Spatial Frequency Components

The  orientation  of  the  spatial  frequency  components  was 
determined  by  examining  the  lines  of  zero  phase  as  outlined 
by Goodman (Ref 10:8-9). The lines of zero phase indicated the
orientation of the particular basis function, i.e.
cos[2π(fx·x + fy·y)] and sin[2π(fx·x + fy·y)], in the x-y
plane. The orientation angle θ with respect to the x-axis is
given by:

$$ \theta = \arctan(f_y / f_x) \tag{4} $$

where fx and fy are the "fundamental" spatial frequencies of
the basis functions.
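
Equation (4) can be evaluated as in the sketch below. The use of `math.atan2` is a choice made here so that fx = 0 does not cause a division by zero; for fx > 0 it returns the same angle as arctan(fy/fx). Returning `None` for the d.c. term is also an assumption of this sketch, since a zero-frequency component has no orientation.

```python
import math

def orientation_deg(fx, fy):
    """Orientation angle of equation (4), in degrees from the x-axis."""
    if fx == 0 and fy == 0:
        return None  # d.c. term: orientation undefined (sketch convention)
    return math.degrees(math.atan2(fy, fx))
```

For example, the component at fx = 1, fy = 1 is oriented at 45 degrees, and a component with fx = 0 and fy > 0 is oriented at 90 degrees.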


IV.  Input Data


Technique for Discretizing Inputs

The  input  data  used  as  templates  for  this  experiment 
were  handprinted  English  capital  letters  A  through  Z,  drawn 
on  an  8x8  grid.  The  grid  was  intended  to  ideally  simulate  a 
64-element  sensor  array  which  may  be  used  as  an  input  device 
to  a  character  recognition  system.  The  sensor  response  was 
assumed  to  be  linear  with  respect  to  the  radiance  of  the 
input.  The  letters  were  drawn  to  completely  fill  the  grid, 
and  were  discretized  by  subdividing  each  element  of  the  grid 
into  eight  subelements  and  visually  determining  the  extent 
that  each  subelement  was  filled  by  the  character.  Minor 
adjustments  were  made  on  the  template  letters  during  the 
discretizing  process  to  maintain  symmetry.  If  all  of  the 
subelements  were  filled,  the  element  was  given  a  value  of  0. 
If  all  subelements  were  empty,  the  element  was  given  a  value 
of  8.  Twenty  test  letters  were  discretized  in  the  same 
manner.  The  template  letters  and  test  letters  are  shown  in 
Appendix  A  and  Appendix  B,  respectively. 

Uniform Background and Contrast Changes

Since  this  was  an  attempt  to  simulate  a  sensor  array 
input  device,  uniform  changes  in  contrast  and  background 
radiance  were  also  simulated.  Changes  in  background  radiance 
were simulated by adjusting the value of each subelement
within the elements of the grid. If the background radiance
was  increased  by  50%,  then  each  subelement  was  assigned  the 
value  1.5.  Therefore,  if  all  of  the  subelements  were  filled, 
the  value  remained  0,  but  if  all  subelements  were  empty,  the 
value  increased  to  12. 
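
The element values described above can be reproduced with a small helper. This is a sketch only: the text gives the two end points (all subelements filled or all empty), and the linear rule for partially filled elements is an assumption made here, as are the function and parameter names.

```python
def element_value(filled, radiance=1.0, subelements=8):
    """Value of one grid element given how many of its subelements are
    filled by the character. Each empty subelement contributes `radiance`."""
    if not 0 <= filled <= subelements:
        raise ValueError("filled must be between 0 and subelements")
    return (subelements - filled) * radiance
```

With the nominal subelement value of 1.0, an empty element is 8 and a filled element is 0; with the background radiance raised by 50%, an empty element becomes 12 while a filled element stays at 0.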

The  contrast  of  an  image  is  defined  by: 


$$ C = \frac{L_{max} - L_{min}}{L_{max} + L_{min}} $$


where  Lmax  and  Lmin  are  the  maximum  and  minimum  radiances, 
respectively.  The  contrast  remains  constant  for  changes  in 
background  radiance.  Uniform  contrast  changes  were  simulated 
by  adding  a  constant  to  the  value  of  each  element  of  the 
grid.  While  uniform  changes  in  the  contrast  had  no  effect  on 
the  value  of  the  DFT  coefficients  (other  than  changing  the 
value of the (0,0), or d.c., term), background changes had a
definite effect on the value of the coefficients. The effects
of uniform changes in background radiance and contrast on the
DFT coefficients for the letter A are shown in Figure IV-1.

Since it is desirable for a system to be independent of
changes in background radiance and contrast, an adjustment
factor was necessary. This adjustment factor F was determined
to be:

$$ F = \frac{A(0,0)}{M \cdot N} - L_{min} \tag{7} $$

where A(0,0)/(M·N) is the d.c. component, or average radiance
of the input, and Lmin is the minimum radiance of the input.

When  each  spatial  frequency  component  is  divided  by  this 
adjustment  factor,  the  DFT  becomes  independent  of  uniform 
changes  in  background  radiance  and  contrast. The  results  of 
using  the  adjustment  factor  F  on  the  DFT  coefficients  of  the 
letter  A  are  shown  in  Figure  IV-2. 

This  adjustment  factor  was  very  effective  in  eliminating 
adverse  effects  due  to  uniform  background  luminance  and 
contrast  changes.  The  adjusted  DFT  coefficients  were  found  to 
remain  precise  to  two  decimal  places  for  uniform  changes  in 
the values of background luminance (Lmax) ranging from 0.04
to  80,000;  and  for  uniform  contrast  changes  ranging  from  1 
(maximum  contrast)  to  0.04.  The  errors  at  the  limits  were 
attributable  to  rounding  errors  in  the  sine  and  cosine  basis 
functions  used  for  computation  of  the  DFT  coefficients. 
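
The effect of the adjustment factor of equation (7) can be checked numerically. The sketch below uses NumPy's FFT in place of the original computation, models a background radiance change as a multiplication of every element and a contrast change as an added constant (as described above), and compares only the non-d.c. coefficients, since the d.c. term itself still shifts when a constant is added. All names are hypothetical.

```python
import numpy as np

def adjusted_dft(image):
    """Divide the unscaled 2-D DFT of `image` by F = A(0,0)/(M*N) - Lmin."""
    image = np.asarray(image, dtype=float)
    coeffs = np.fft.fft2(image)
    # A(0,0) is the unscaled sum of all samples, so A(0,0)/(M*N) is the mean.
    factor = coeffs[0, 0].real / image.size - image.min()
    if factor == 0:
        raise ValueError("uniform input: adjustment factor is zero")
    return coeffs / factor

# Hypothetical 8x8 input with values 0..8, then a 50% radiance increase
# plus a constant offset.
rng = np.random.default_rng(1)
base = rng.integers(0, 9, size=(8, 8)).astype(float)
changed = 1.5 * base + 3.0

non_dc = np.ones((8, 8), dtype=bool)
non_dc[0, 0] = False
same = np.allclose(adjusted_dft(base)[non_dc], adjusted_dft(changed)[non_dc])
```

Here `same` evaluates to `True`: scaling multiplies both the non-d.c. coefficients and F by the same constant, and an added offset changes neither.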

Figure IV-2. Magnitudes of the DFT Coefficients of Letter A
with Adjustment Factor Applied.

Sinusoids are often represented as vectors in the
complex  number  plane  in  a  technique  known  as  phasor  analysis. 
In  phasor  analysis,  the  coefficient  of  a  sinusoid  is  plotted 
as  a  vector  of  given  magnitude  and  phase.  Representing 
sinusoids  in  this  manner  reduces  the  addition  and  subtraction 
of  sinusoids  to  simple  algebraic  manipulations  of  complex 
numbers.
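
As a brief illustration of ordinary (one-dimensional) phasor analysis, the sketch below adds sinusoids of a common frequency by adding their complex phasors. It uses cosine-referenced phasors, which is a convention chosen for this example; the function name is hypothetical.

```python
import cmath
import math

def add_phasors(*terms):
    """Add sinusoids a*cos(w*t + phi) of the same frequency.

    Each term is an (amplitude, phase) pair; the result is the
    (amplitude, phase) of the single equivalent sinusoid.
    """
    total = sum(cmath.rect(a, phi) for a, phi in terms)
    return abs(total), cmath.phase(total)
```

For instance, cos(t) + cos(t + π/2) reduces to a single sinusoid of amplitude √2 and phase π/4.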

Since  the  unique  Fourier  Series  coefficients  as  defined 
previously  are  coefficients  of  sinusoids,  they  can  be  used  to 
represent  the  spatial  frequency  component  as  a  vector.  The 
complex plane used to plot the magnitude and phase of each
two-dimensional Fourier Series coefficient was oriented at
the orientation angle θ equivalent to the orientation of the
basis function of the coefficient being plotted. Two complex
spaces,  designated  Space  1  and  Space  2,  were  used  to  plot  the 
two-dimensional  Fourier  Series  coefficients  as  three- 
dimensional  phasors. 

In  Space  1,  the  unique  A(fx,fy)  coefficient  was  plotted 
in the x-y plane at an angle θ, equal to the orientation
angle, from the x-axis. The corresponding B(fx,fy)
coefficient was plotted in the +z direction, orthogonal to
the A(fx,fy) coefficient. This yielded a single vector,
{A(fx,fy)cos θ, A(fx,fy)sin θ, B(fx,fy)}, which represented
the component at two-dimensional spatial frequency fx,fy. The
input scene f'(x,y) was represented by a vector sum of its
unique spatial frequency components. A pictorial
representation of Quadrant I of Space 1 is given in Figure
V-1.

Figure V-1. Quadrant I of Space 1 (axis labels: A(fx,0), A(0,fy), B(fx,fy); angle θ).
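
The Space 1 construction can be sketched as a sum of per-component vectors {A cos θ, A sin θ, B}. The code below is an illustration under stated assumptions, not the original program: components are supplied by the caller as (fx, fy, A, B) tuples holding the unique Fourier Series coefficients, θ is computed with `math.atan2`, and the treatment of the d.c. term (which has no orientation) is left to the caller.

```python
import math

def space1_vector(components):
    """Sum the Space 1 vectors (A*cos(theta), A*sin(theta), B) over an
    iterable of (fx, fy, A, B) tuples and return (x, y, z)."""
    sx = sy = sz = 0.0
    for fx, fy, a, b in components:
        theta = math.atan2(fy, fx)   # orientation angle, equation (4)
        sx += a * math.cos(theta)
        sy += a * math.sin(theta)
        sz += b
    return (sx, sy, sz)
```

As a usage example with made-up coefficients, `space1_vector([(1, 0, 2.0, 1.0), (0, 1, 3.0, -0.5)])` yields approximately (2.0, 3.0, 0.5): the first component lies along the x-axis, the second along the y-axis, and the B terms add along z.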

In Space 2, the coefficients were plotted differently. A
coefficient and its conjugate, computed using the DFT, at a
particular spatial frequency pair fx,fy and -fx,-fy,
respectively, were plotted such that they summed vectorially
to yield the unique Fourier Series coefficient defined
previously. For this to be accomplished, each quadrant of the
space, which corresponded with a quadrant of the DFT, was
defined differently. The quadrants of Space 2 are pictured in
Figure V-2.

Figure V-2. Quadrants of Space 2.


In quadrant I, the positive A(fx,fy; fx≥0, fy>0)
coefficients were plotted radially outward, and the positive
B(fx,fy; fx≥0, fy>0) coefficients were plotted in the +z direction. In
quadrant II, the positive A(fx,fy; fx>0, fy≤0) coefficients were plotted
radially outward, and the positive B(fx,fy; fx>0, fy≤0)
coefficients were plotted in the -z direction. In quadrant
III, the positive A(fx,fy; fx≤0, fy<0) coefficients were
plotted radially inward, and the B(fx,fy; fx≤0, fy<0)
coefficients were plotted in the -z direction. In quadrant
IV, the positive A(fx,fy; fx<0, fy≥0) coefficients were plotted
radially inward, and the positive B(fx,fy; fx<0, fy≥0)
coefficients were plotted in the +z direction.


Testing Methods


The  hypothesis  that  the  essential  spatial  frequency 
information  can  be  retained  in  three-dimensional  space  was 
tested  in  two  phases.  In  the  first  phase,  six  sets  of 
templates  were  computed  using  discretized  capital  English 
letters  A  through  Z.  The  template  vector  sets  were  computed 
using  the  unfiltered  Fourier  Series  coefficients  plotted  in 
Space  1  and  Space  2,  the  Fourier  Series  coefficients  filtered 
by  MTF(H)  and  plotted  in  Space  1  and  Space  2,  and  the  Fourier 
Series  coefficients  filtered  by  MTF(L)  and  plotted  in  Space  1 
and  Space  2.  The  template  vector  sets  were  examined  to  see 
that 1) the letters were separable in three-dimensional space;
and 2) letters which human observers perceive as similar were
clustered  together. 

In  the  second  phase,  twenty  test  letter  vectors  were 
computed  using  methods  corresponding  to  stored  template 
vectors.  The  test  letter  vectors  were  then  compared  to  the 
stored  template  vectors,  using  Euclidean  distance  as  a 
metric. 
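
The comparison step can be sketched as a nearest-template ranking under the Euclidean metric. The template coordinates in the usage note are made up for illustration and are not values from this research; the function name is hypothetical.

```python
import math

def rank_templates(test_vector, templates):
    """Return template names ordered from nearest to farthest from
    `test_vector`, using Euclidean distance in three dimensions.

    `templates` maps a letter to its (x, y, z) template vector.
    """
    return sorted(templates,
                  key=lambda name: math.dist(test_vector, templates[name]))
```

For example, with hypothetical templates `{"A": (0, 0, 0), "B": (1, 1, 1), "C": (5, 5, 5)}`, the test vector (0.9, 1.0, 1.2) is ranked B, A, C, so B would be reported as the first alternative.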

Results 

The  template  vectors  for  the  handprinted  English  letters 
are  given  in  Figures  V-3(a)-(f).  The  letters  were  separable 
in  this  scheme,  and  the  letters  which  looked  most  alike  to 
human observers (C, G, O, and Q; P and R; H and N) were
clustered together in all template sets.

Figure V-3(c). Coordinates and Clustering of MTF(H)-Filtered
Template Vectors in Space 1.

Figure V-3(d). Coordinates and Clustering of MTF(H)-Filtered
Template Vectors in Space 2.

Figure V-3(e). Coordinates and Clustering of MTF(L)-Filtered
Template Vectors in Space 1.


Next,  the  test  letter  vectors  were  compared  with  the 
stored template vectors. The results of these tests are given
in  Figures  V-4(a)-(f).  The  criteria  used  to  determine  the 
best  results  involved  looking  for  not  only  the  scheme  with 
the  most  correct  identifications  of  the  test  letters,  but 
also  looking  for  the  scheme  with  the  lowest  rank  total  (sum 
of  the  positions  1-26  of  the  correct  identity  of  the  test 
letters).  In  addition,  the  number  of  times  the  correct 
identity  of  the  letter  was  first  or  second  alternative  was 
examined.  This  may  be  useful  if  this  character  recognition 
technique  was  used  in  conjunction  with  a  tree-organized  word 
recognizer.
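
The three evaluation criteria described above (correct identifications, rank total, and first-or-second alternatives) can be computed as in the following sketch. The input format, a list of (true letter, ranked candidate list) pairs, and the dictionary keys are assumptions made for this illustration.

```python
def score(results):
    """Summarize recognition results.

    `results` is a list of (truth, ranking) pairs, where `ranking` lists
    candidate letters from best to worst match and contains `truth`.
    """
    if not results:
        raise ValueError("no results to score")
    # Position (1-based) of the correct letter in each ranking.
    ranks = [ranking.index(truth) + 1 for truth, ranking in results]
    n = len(ranks)
    return {
        "recognition_rate": sum(r == 1 for r in ranks) / n,
        "top_two_rate": sum(r <= 2 for r in ranks) / n,
        "rank_total": sum(ranks),
    }
```

A lower rank total indicates that, even when a letter is misidentified, its correct identity tends to sit near the top of the candidate list.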

The  best  results  were  achieved  using  no  filtering 
and  plotting  the  coefficients  in  Space  2.  Using  this  scheme, 
60%  recognition  was  achieved  (75%  of  the  letters  were  at 
least  the  second  alternative),  and  most  of  the  substitutions 
made appeared to be reasonable substitutions (Y for H; F for
P; Z for I; J for O; C for G). Only two of the substitutions,
O for T and X for L, did not appear to be reasonable
substitutions. The remaining methods achieved 40% to 55%
recognition. 

* This letter was identified as either H or A by human
observers; ranking was based on position of either H or A.

Figure V-4(a). Test Results Using Unfiltered DFS Coefficients
Plotted in Space 1.

Figure V-4(b). Test Results Using Unfiltered DFS Coefficients
Plotted in Space 2.

Figure V-4(c). Test Results Using MTF(H)-Filtered DFS
Coefficients Plotted in Space 1.

Figure V-4(e). Test Results Using MTF(L)-Filtered DFS
Coefficients Plotted in Space 1.

This research exhibited a method to represent the
spatial frequency content of an input as a three-dimensional
vector representing the summation of the magnitude, phase,
and orientation of the components. This vector was composed
of the form of the input, contained in the low spatial
frequency components, and the detail of the input, contained
in the high spatial frequency components. Since only three
dimensions were required to describe any input, templates
could be stored efficiently, and distance computations were
very simple. In contrast, other methods of pattern
recognition utilizing spatial frequency components required
the storage and correlation of 49 Fourier components and
sacrificed the detail of the input by discarding the high
spatial frequency components.


This research, while showing that phasor analysis in
three dimensions was possible, did have five definite
shortcomings. First, the spatial frequency spectrum of the
input was severely bandlimited, due to the low spatial
resolution associated with the simulated input sensor. In the
human visual system, the spatial frequency spectrum of an
input ranges from approximately 0.2 cycles/degree to
approximately 50 cycles/degree, based on the filtering
properties (MTF) of the system. In this research, the spatial
frequency spectrum ranged from 1 cycle/degree to
4 cycles/degree. Even with this severe bandlimiting,
60 percent recognition was achieved. A higher resolution
input technique should significantly improve the recognition
rate. Second, pure sine and cosine inputs of differing
spatial frequencies but of the same magnitude and orientation
had identical vector representations when the components were
unfiltered. Filtering of the spatial frequency components
using MTF(H) and MTF(L) did alleviate this problem, since
each component was scaled differently. Another solution may
be to use trigonometric identities to allow the spatial
frequency components to be redefined in terms of the
fundamental spatial frequencies. Third, only 26 template
letters were used in this research. In the human visual
system, hundreds of templates, or sets of template areas,
exist for the characters used as templates in this research.
By employing more templates, statistical analysis could be
used for pattern recognition. Fourth, noise effects were
neglected. It is suspected that if the MTF of the human
visual system is used, the noise performance of the
vector-based system would be comparable to that of the human
visual system. Finally, the effects of size variance were not
examined. These are areas on which future research should be
focused.

The concept of phasor representation of sinusoids, applied
to represent the spatial frequency content
of an input, appears to be a viable technique to reduce the
dimensionality  required  to  describe  an  input.  Further 
research  is  necessary  to  determine  the  full  impact  of  this 
technique  in  the  area  of  pattern  recognition. 

APPENDIX A


The  template  letters  used  in  this  research  are  shown 
in  numerical  and  gray  scale  representation  in  Figure  A-2. 
The  gray  scale  used  for  the  representations  is  given  in 
Figure A-1, and ranges from 0 (black) to 8 (white). To
observe  the  representations  with  the  fundamental  spatial 
frequency  equal  to  1  cycle/degree,  the  letters  should  be 
viewed  at  a  distance  of  approximately  1.4  meters.  The 
templates  are  discretized  versions  of  handprinted 
English  capital  letters  A-Z,  with  symmetry  added  during 
the  discretization. 

Figure A-2. Letters Used as Templates

APPENDIX B


The test letters used in this research are shown
in numerical and gray scale representation in Figure B-2.
The gray scale used for the representations is given in
Figure B-1, and ranges from 0 (black) to 8 (white). To
observe  the  representations  with  the  fundamental  spatial 
frequency  equal  to  1  cycle/degree,  the  letters  should  be 
viewed  at  a  distance  of  approximately  1.4  meters. 

The  test  letters  are  variants  of  the  templates  given  in 
Appendix  A.  The  test  letters  consist  of  an  A-to-H 
transformation (Letters 1 through 5), 10 handprinted letters
without symmetry adjustment, and 5 miscellaneous variants.